An Empirical Study for Determining Relevant Features for Sentiment Summarization of Online Conversational Documents

Authors

  • Gino Mangnoesing
  • Arthur H. van Bunningen
  • Alexander Hogenboom
  • Frederik Hogenboom
  • Flavius Frasincar
Abstract

The phenomenon of big data makes managing, processing, and extracting valuable information from the Web an increasingly challenging task. As such, the abundance of user-generated content with opinions about products or brands requires appropriate tools in order to be able to capture consumer sentiment. Such tools can be used to aggregate content by means of sentiment summarization techniques, extracting text segments that reflect the overall sentiment of a text in a compressed form. We explore what features distinguish relevant from irrelevant text segments in terms of the extent to which they reflect the overall sentiment of conversational documents. In our empirical study on a collection of Dutch conversational documents, we find that text segments with opinions, segments with arguments supporting these opinions, segments discussing aspects of the subject of a text, and relatively long sentences are key indicators for text segments that summarize the sentiment conveyed by a text as a whole.

Similar resources

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis has become an emerging field. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...

Feature-based Product Review Summarization Utilizing User Score

With the steadily increasing volume of e-commerce transactions, the amount of user-provided product reviews on the Web is increasing. Because many customers feel that they can purchase products based on the experiences of others that are obtainable through product reviews, the review summarization process has become important. In particular, feature-based product review summarization is needed in...

Sentiment Strength Prediction Using Auxiliary Features

With an increasingly large amount of sentimental information embedded in online documents, sentiment analysis is quite valuable to product recommendation, opinion summarization, and so forth. Different from most works on identifying documents’ qualitative affective information, this research focuses on the measurement of users’ intensity over each sentimental category. Affect indicates positive...

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has received considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are faced with documents of large data volume, the extr...

Publication date: 2012